Prior preference learning from experts: Designing a reward with active inference
Authors
Abstract
Active inference may be defined as Bayesian modeling of a brain with a biologically plausible model of the agent. Its primary idea relies on the free energy principle and the prior preference: an agent will choose an action that leads to its prior preference for a future observation. In this paper, we claim that active inference can be interpreted using reinforcement learning (RL) algorithms and find a theoretical connection between them. We extend the concept of expected free energy (EFE), which is a core quantity in active inference, so that EFE can be treated as a negative value function. Motivated by this connection, we propose a simple but novel method for learning a prior preference from experts. This illustrates that the problem of inverse RL can be approached from the new perspective of active inference. Experimental results show the possibility of EFE-based rewards and their application to the inverse RL problem.
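The abstract's central quantity can be made concrete. Below is a standard form of the expected free energy from the active inference literature, together with its usual decomposition into epistemic and pragmatic terms; the notation follows common conventions and is not necessarily the paper's own.

```latex
% Expected free energy of a policy \pi at time \tau (standard form):
G(\pi, \tau) = \mathbb{E}_{Q(o_\tau, s_\tau \mid \pi)}
  \left[ \ln Q(s_\tau \mid \pi) - \ln P(o_\tau, s_\tau \mid \pi) \right]

% Usual decomposition: an epistemic (information-gain) term plus a
% pragmatic term that scores observations against the prior
% preference \tilde{P}(o_\tau):
G(\pi, \tau) =
  - \mathbb{E}_{Q(o_\tau \mid \pi)}
      \left[ D_{\mathrm{KL}}\!\left[ Q(s_\tau \mid o_\tau, \pi)
        \,\|\, Q(s_\tau \mid \pi) \right] \right]
  - \mathbb{E}_{Q(o_\tau \mid \pi)}
      \left[ \ln \tilde{P}(o_\tau) \right]
```

Under this reading, the log prior preference $\ln \tilde{P}(o_\tau)$ plays the role of a reward, which is what makes the "EFE as negative value function" interpretation and the inverse-RL connection possible.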
Similar resources
Active Preference-Based Learning of Reward Functions
Our goal is to efficiently learn reward functions encoding a human’s preferences for how a dynamical system should act. There are two challenges with this. First, in many problems it is difficult for people to provide demonstrations of the desired system trajectory (like a high-DOF robot arm motion or an aggressive driving maneuver), or to even assign how much numerical reward an action or traj...
Dopamine, reward learning, and active inference
Temporal difference learning models propose phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We ...
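The temporal difference account sketched above can be illustrated with a minimal TD(0) value update, in which the prediction error `delta` plays the role attributed to phasic dopamine. This is the standard textbook formulation, not the study's own model; the toy chain environment is illustrative only.

```python
# Minimal TD(0) sketch: the reward prediction error `delta` is the
# quantity that phasic dopamine signaling is proposed to encode.

def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference update of the value table V."""
    delta = r + gamma * V[s_next] - V[s]  # reward prediction error
    V[s] += alpha * delta
    return delta

# A 3-state chain 0 -> 1 -> 2, with reward delivered on entering
# the terminal state 2.
V = {0: 0.0, 1: 0.0, 2: 0.0}
for _ in range(50):
    td_update(V, 0, 0.0, 1)
    td_update(V, 1, 1.0, 2)
```

After repeated episodes, value backs up along the chain: `V[1]` approaches the reward of 1.0 and `V[0]` approaches the discounted value `gamma * V[1]`, while the prediction error at the rewarded transition shrinks toward zero.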
Active Reward Learning from Critiques
Learning from demonstration algorithms, such as Inverse Reinforcement Learning, aim to provide a natural mechanism for programming robots, but can often require a prohibitive number of demonstrations to capture important subtleties of a task. Rather than requesting additional demonstrations blindly, active learning methods leverage uncertainty to query the user for action labels at states with ...
Active learning with a misspecified prior
We study learning and information acquisition by a Bayesian agent whose prior belief is misspecified in the sense that it assigns probability zero to the true state of the world. At each instant, the agent takes an action and observes the corresponding payoff, which is the sum of a fixed but unknown function of the action and an additive error term. We provide a complete characterization of asy...
Preference Inference through Rescaling Preference Learning
One approach to preference learning, based on linear support vector machines, involves choosing a weight vector whose associated hyperplane has maximum margin with respect to an input set of preference vectors, and using this to compare feature vectors. However, as is well known, the result can be sensitive to how each feature is scaled, so that rescaling can lead to an essentially different ve...
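The setup described above, choosing a weight vector that ranks preferred items above dispreferred ones, can be sketched with a simple perceptron-style learner standing in for the max-margin SVM; each training pair yields a preference vector `d = phi(better) - phi(worse)`, and we seek `w` with `w . d > 0`. The feature vectors here are toy values, and the snippet's point about scaling applies directly: multiplying one feature by a constant changes the learned direction.

```python
# Hedged sketch of margin-based preference learning: a perceptron
# stands in for the max-margin SVM described in the snippet.

def learn_weights(pref_vectors, epochs=100, lr=0.1):
    """Find w such that w . d > 0 for every preference vector d."""
    w = [0.0] * len(pref_vectors[0])
    for _ in range(epochs):
        for d in pref_vectors:
            score = sum(wi * di for wi, di in zip(w, d))
            if score <= 0:  # misranked pair: nudge w toward d
                w = [wi + lr * di for wi, di in zip(w, d)]
    return w

# Two toy preference vectors over 2-D features.
prefs = [[1.0, -0.5], [0.5, 1.0]]
w = learn_weights(prefs)
```

If the preference vectors are linearly separable from the origin, the learned `w` ranks every training pair correctly; rescaling a feature before training generally yields an essentially different ranking function, which is the sensitivity the snippet discusses.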
Journal
Journal title: Neurocomputing
سال: 2022
ISSN: 0925-2312, 1872-8286
DOI: https://doi.org/10.1016/j.neucom.2021.12.042